The Parity Source Coder is a protocol for data compression which is based on a set of parity checks 
organized in a sparse random network. We consider here the case of memoryless unbiased binary 
sources. We show that the theoretical capacity saturates the Shannon limit at large K. We also find 
that the first corrections to the leading behavior are exponentially small, so that the behavior at 
finite K is very close to the optimal one. 

I. INTRODUCTION 

The Parity Source Coder (PSC) is a new scheme for lossy data compression, which uses a kind of dual approach [1] 
to the LDPC codes used in channel coding [2]. It has been introduced in [?] and discussed recently in [?] and [?]. We 
discuss here its theoretical performance. 

The idea of the PSC is to use the $M$ bits $\underline{x}_M = \{x_1, \ldots, x_M\}$ that we want to compress to build $M$ parity checks 
on a low-density graph involving $N\,(< M)$ boolean variables $\underline{y}_N = \{y_1, \ldots, y_N\}$. From the theoretical point of view 
we will be interested in the 'thermodynamic' limit where $N$ and $M$ go to infinity while the rate $R = N/M$ is kept 
fixed. The topology is defined as follows: each constraint is connected to exactly $K$ variables chosen at random. This 
implies that the probability distribution of the variable connectivity is Poissonian (as in Erdős-Rényi random graphs) 
with mean $K\alpha$, where $\alpha = M/N$. This is the general setting for a number of constraint satisfaction problems [?]. In our case such a 
graph (cfr. Fig. 1) defines a set of $M$ linear equations for the $N$ variables: 

$$y_{i_1^a} + y_{i_2^a} + \cdots + y_{i_K^a} = x_a \mod 2, \qquad a = 1, \ldots, M, \tag{1}$$

where $x_a, y_i \in \{0, 1\}$, and the indices $i_1^a, i_2^a, \ldots, i_K^a$ are chosen in $\{1, \ldots, N\}$ with uniform distribution (the repetition 
of two indices in the same constraint can be forbidden, but this is irrelevant in the large $N$ limit which interests us 
here). This problem is called $K$-XORSAT and it has been recently studied in [?] and [?]. It is also a diluted 
version of the $p$-spin model used in spin glass theory [?]. Here we use it to set up a data compressor, following [?]. 
The encoded word corresponds to the solution of the linear system which minimizes the number of errors. In the 
thermodynamic limit, it has been shown that the $K$-XORSAT problem has a phase 
transition at a critical value $\alpha_c$ of the ratio $\alpha = M/N$. For $\alpha < \alpha_c$ a random instance is satisfiable (in the sense that 
there exists an assignment of the $N$ variables satisfying all $M$ equations) with probability one. This is the SAT phase. 
For $\alpha > \alpha_c$ a random instance is unsatisfiable with probability one: there is no assignment satisfying all constraints. 
The critical density of constraints $\alpha_c$ increases with $K$ and goes exponentially fast to 1 as $K$ increases (Fig. 2), as 
can be computed using the formalism introduced in [?]. The $K$-XORSAT can be used for data compression by 
working in the UNSAT phase with $\alpha > 1$. As the encoding step $\underline{x}_M \to \underline{y}_N$ consists in finding the string $\underline{y}_N$ which 
violates the smallest number of constraints in (1), the compression rate is $R = 1/\alpha$. Once we have the encoded word, 
the decompression step $\underline{y}_N \to \underline{x}_M$ is done by setting $x_a = 0$ or 1 according to eq. (1). The distortion is defined as the 
number of bits which are not properly recovered, divided by the total number of bits $M$. We can look at the problem 
in terms of a "cost" function $\varepsilon_a(y_{i_1^a}, \ldots, y_{i_K^a} \,|\, x_a)$ which is 0 if eq. (1) is verified and 2 otherwise. The total cost $E$ of the 
compression process is then twice the total number of unsatisfied equations in the linear system (1). The distortion 
is related to it by 

$$D = \frac{E}{2M} = \frac{1}{2\alpha}\,\frac{E}{N}. \tag{2}$$
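The encoding and decoding steps described above can be illustrated on a toy instance. The following is only a minimal sketch: the function names (`random_psc_graph`, `decode`, `encode_brute_force`) and the sizes $M = 12$, $N = 6$, $K = 3$ are assumptions made for illustration, and the exhaustive search over all $2^N$ strings is practical only for very small $N$. It is not a proposal for a real encoder.

```python
import itertools
import random

def random_psc_graph(M, N, K, rng):
    """Each of the M checks picks K distinct variable indices uniformly at random."""
    return [rng.sample(range(N), K) for _ in range(M)]

def decode(y, checks):
    """Decompression: bit a is the parity of the K variables attached to check a."""
    return [sum(y[i] for i in idx) % 2 for idx in checks]

def encode_brute_force(x, checks, N):
    """Exhaustive search for the string y violating the fewest checks (tiny N only)."""
    best_y, best_errors = None, len(x) + 1
    for y in itertools.product((0, 1), repeat=N):
        errors = sum(a != b for a, b in zip(decode(y, checks), x))
        if errors < best_errors:
            best_y, best_errors = list(y), errors
    return best_y, best_errors

rng = random.Random(0)            # fixed seed so the toy run is reproducible
M, N, K = 12, 6, 3                # illustrative sizes, alpha = M / N = 2
checks = random_psc_graph(M, N, K, rng)
x = [rng.randint(0, 1) for _ in range(M)]

y, errors = encode_brute_force(x, checks, N)
energy = 2 * errors               # cost 2 per violated check, as in the text
distortion = energy / (2 * M)     # eq. (2)
print("rate R =", N / M, " distortion D =", distortion)
```

Since the parity of any check is unbiased when averaged over all strings $\underline{y}_N$, the best string can never do worse than $D = 1/2$.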

We consider here the simplest version of the lossy compression problem: we deal with uncorrelated unbiased binary 
sources, i.e. $\mathrm{prob}(x_1, \ldots, x_M) = \prod_{a=1}^{M} \mathrm{prob}(x_a)$ and $\mathrm{prob}(x_a = 0) = \mathrm{prob}(x_a = 1) = 1/2$. The rate distortion theorem [11] 
states that a distortion $D$ can be achieved if and only if the rate is large enough, $R > R^*$, where the Shannon bound 
$R^*$ is given by 

$$R^* = 1 - H_2(D),$$

and $H_2(x) = -x \log_2 x - (1 - x) \log_2(1 - x)$ is the binary entropy. Basically the proof of achievability in this theorem 
relies on a choice of codewords (the set of all possible encoded words) which is a random set. This is intimately related 
to the random energy model (REM) [12]. On the other hand, our PSC can be argued to become a random energy 
model in the large $K$ limit, in the same way as the $p$-spin model becomes a REM in the large $p$ limit [?]. Seen 
from this point of view, it is not surprising that the performances of the PSC converge to the Shannon bound in the 
limit of large $K$, as we shall prove here. In fact the same optimal performance has been found in a recent work [?] 
using a non-monotonic perceptron. Again in such a device each bit of the decoded word is chosen to be a function 
of the complete encoded word, which is the same as letting $K = N$, i.e. infinity in the thermodynamic limit, in our 
language. 

FIG. 1: A Tanner graph for a PSC with $M = 7$ checks and $N = 4$ variables. In this example the string to be compressed is 
$\{x_1, x_2, \ldots, x_7\} = 1001101$. The constraints $x_1, x_4, x_5, x_7$ impose the sum of the variables $y_i$ involved in each constraint to be 1 
mod 2, while $x_2, x_3, x_6$ require that the variables add up to 0 mod 2. 
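The Shannon bound quoted above can be inverted numerically to obtain the smallest achievable distortion at a given rate. The sketch below assumes the binary entropy is measured in bits (logarithm in base 2); the helper names `binary_entropy` and `shannon_distortion` are illustrative only.

```python
import math

def binary_entropy(x):
    """H_2(x) in bits, with the convention 0 log 0 = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def shannon_distortion(rate, tol=1e-12):
    """Solve 1 - H_2(D) = rate for D in [0, 1/2] by bisection.

    1 - H_2(D) decreases from 1 to 0 on [0, 1/2], so the root is unique
    for any rate in (0, 1).
    """
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if 1.0 - binary_entropy(mid) > rate:
            lo = mid      # distortion still too small for this rate
        else:
            hi = mid
    return 0.5 * (lo + hi)

for alpha in (1.3, 2.0):
    print("alpha =", alpha, " D_Sh =", round(shannon_distortion(1.0 / alpha), 6))
```

For a rate $R = 1/2$ the bound gives a distortion close to 0.11.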

However all these "optimal" source coding devices, based either on a random codebook like in the REM, on a 
fully connected perceptron, or on the PSC at $K \to \infty$, have a serious drawback: there is no known fast algorithm 
to perform the encoding. Physically, the encoding step is a search for the ground state, the one which minimizes the 
number of violated constraints. This has to take place in the UNSAT phase $\alpha > 1$ where these systems are frustrated. 
Finding the exact ground state is an NP-complete problem, but it turns out that we don't even have good heuristics to 
find approximate ground states. Such a heuristic of course cannot exist for the REM, but one could hope to find one 
for the PSC with finite $K$. For instance in the related problem of $K$-satisfiability [?], or in source coding devices based 
on random nodes [?], there exist good heuristics based on the message passing "survey propagation" (SP) algorithm, 
which can be seen as a generalization of the celebrated 'belief propagation' algorithm [?]. While this algorithm, 
as such, does not work for the PSC, it seems possible that one could develop powerful algorithms for the finite-$K$ PSC 
in the future. Actually, a very recent work [18] proposes a message passing algorithm, inspired by SP, which seems 
to show very good performance. This motivates the present study of the theoretical capacity of the PSC at finite $K$. 

In this note we compute explicitly the distortion of the PSC in the limit where the clause connectivity $K$ becomes 
large. We first show that for $K \to \infty$ the distortion becomes optimal (it saturates the Shannon bound). As for the 
finite-$K$ corrections, we find that, for a given value of the rate $R = 1/\alpha$, the distortion is 

$$D = D_{\mathrm{Sh}} + a\,\sqrt{K}\,e^{-K\Delta}\,\bigl(1 + O(1/K)\bigr), \tag{3}$$

where $D_{\mathrm{Sh}}$ satisfies $1 - H_2(D_{\mathrm{Sh}}) = 1/\alpha$ and the coefficients $a$ and $\Delta$ depend on $\alpha$. In particular, $\Delta$ lies 
in $[\log 2, 1]$, and goes to $\log 2$ in the large $\alpha$ limit. The fact that the first finite-$K$ corrections are exponentially small 
must be stressed: this means that even a parity source coder with $K = 5$ or 6 is in practice nearly optimal. A good 
encoding algorithm for this case could thus turn this PSC into a very good compressor. We stress that the range of 
validity of the result of this paper is limited to the case of uncorrelated sources. This is confirmed by the statistical 
description of a family of code ensembles presented in [19]. On the other hand, the hypothesis of a non-biased input 
message does not seem to play a role. 

As we mentioned previously, a protocol very similar to this PSC (the only difference being the underlying graph 
topology) has been introduced in [?], and Murayama [20] has shown that some belief-propagation based algorithm 
can be used for encoding in the $K = 2$ case. Our result shows that the optimal capacity (i.e. Shannon's bound) can 
be obtained only in the limit of large $K$, at variance with some of the statements in [?]. It gives the analog, for source 
coding, of the result of Kabashima and Saad [21] on the channel capacity of error-correcting codes at large $K$. 

FIG. 2: The critical value of the control parameter marking the SAT/UNSAT transition is plotted versus $K$. Following 
[?], one can show that the leading behavior at large $K$ is $\alpha_c(K) = 1 - e^{-K} - (K^2 - K/2)\,e^{-2K} + O(K^5 e^{-3K})$. 



II. CAVITY EQUATIONS 

In order to deal with the $K$-XORSAT problem we take advantage of the cavity method as explained in [?]. This 
method is heuristic (its main assumptions can be checked self-consistently) but it is believed to be exact. As 
for the $K$-XORSAT problem, its range of validity has been rigorously established in [?] and [?]. In particular, the 
cavity result for the critical threshold $\alpha_c$ is exact. For $\alpha > \alpha_c$ (the regime where we use it) this method finds the 
correct ground-state energy up to a threshold value of $\alpha$, which is $\approx 3.07$ for $K = 3$ [?] and increases with $K$ as one 
can see from numerics. 

For the sake of simplicity, we pass from boolean variables to Ising spins, thus taking values in $\{-1, +1\}$. The general 
idea behind the cavity approach is summarized in Fig. 3. Since the local structure of the random graph is tree-like, 
we focus on a single clause and look at the variables connected to it. We introduce two types of messages, cavity 
biases $u_{a \to i}$ going from clause $a$ to variable $i$, and cavity fields $h_{i \to a}$ going from variable $i$ to clause $a$. A cavity bias 
can be 0 (which means that, as far as clause $a$ is concerned, variable $i$ is free to assume any value), or $\pm 1$ (meaning that this 
is the value that $i$ should take in order to satisfy clause $a$). The message sent from clause $a$ must take into account 
all the other variables connected to it; each of these sends to $a$ a cavity field which is nothing but the sum of all the 
other incoming cavity biases: $h_{j \to a} = \sum_{b \in j \setminus a} u_{b \to j}$, where $b$ runs over the clauses attached to $j$ other than $a$. In the most general case, the space of low-energy configurations 
is broken into many disconnected components (clustering). The general object we need to deal with this is then a 
functional distribution $Q[q(u)]$ giving the probability that, if one link $a \to i$ is chosen at random, the probability (with 
respect to the choice of the cluster) of observing a bias $u_{a \to i}$ is $q_{a \to i}(u_{a \to i})$. The same holds for the distribution of 
cavity fields, $P[p(h)]$. We thus suppose to have a population of $q(u)$'s and $p(h)$'s. In order to simplify the notation, 
we shall simply call $u_0$ the bias on variable 0, regardless of the clause it is coming from. According to [?], 
we iterate the following self-consistent equations: 



$$q_0(u_0) = \sum_{h_1, \ldots, h_{K-1}} p^{(p_1)}(h_1) \cdots p^{(p_{K-1})}(h_{K-1})\; \delta\bigl(u_0 - S(J\,h_1 \cdots h_{K-1})\bigr), \qquad \text{with prob. } \prod_{i=1}^{K-1} f_{K\alpha}(p_i), \tag{4}$$

$$p^{(p)}(h) = \frac{1}{A^{(p)}} \sum_{u_1, \ldots, u_p} q_1(u_1) \cdots q_p(u_p)\; \delta\Bigl(h - \sum_{a=1}^{p} u_a\Bigr) \exp\Bigl\{ y \Bigl( \Bigl|\sum_{a=1}^{p} u_a\Bigr| - \sum_{a=1}^{p} |u_a| \Bigr) \Bigr\}, \tag{5}$$

$$A^{(p)} = \sum_{u_1, \ldots, u_p} q_1(u_1) \cdots q_p(u_p)\; \exp\Bigl\{ y \Bigl( \Bigl|\sum_{a=1}^{p} u_a\Bigr| - \sum_{a=1}^{p} |u_a| \Bigr) \Bigr\}. \tag{6}$$

Here $S(x) = \mathrm{sign}(x)$ for $x \neq 0$, $S(0) = 0$, $J = \pm 1$ is the sign of the clause in the Ising representation (fixed by its parity bit $x_a$), and $f_{K\alpha}(\cdot)$ is the Poisson distribution with mean $K\alpha$. The first of these 
equations is the direct implementation of the recursion illustrated in Fig. 3. The delta function ensures that clause 
$a$ sends the proper value to variable 0. 

FIG. 3: The iterative idea behind the cavity equations is illustrated here for $K = 5$. 

In the second equation, a reweighting term is present [?]. This is due 
to the fact that if we add one variable and want to compute the new probability distributions at a given value of 
the energy $E$, then we need all the contributions from the states at energy $E - \Delta E$, where $\Delta E$ is the energy shift 
caused by the addition of one variable. If the number of clusters at energy $E$ is $\exp(N\Sigma(E/N))$, then the expansion 
$\Sigma(E - \Delta E) \simeq \Sigma(E) - y\,\Delta E$ leads to a reweighting $\exp(-y\,\Delta E)$, with $y = d\Sigma/dE$. The knowledge of these distributions 
allows one to compute the free energy $\Phi(y)$: 



$$\Phi(y) = \Phi_1(y) - (K - 1)\,\alpha\,\Phi_2(y), \tag{7}$$

$$\Phi_1(y) = -\frac{1}{y}\,\overline{\log A^{(p)}(y)}, \tag{8}$$

$$\Phi_2(y) = -\frac{1}{y}\,\overline{\log \Bigl[ \sum_{u} q(u; \{p_i\}) \sum_{h} p^{(p)}(h)\, e^{\,y\,(|u + h| - |u| - |h|)} \Bigr]}, \tag{9}$$

where the average (denoted by the overline) is taken over the random graph ensemble and over the population of the distributions $q(u)$'s and 
$p(h)$'s. The free energy in (7) is obtained by adding one variable (and a certain number of clauses) to a system with $N$ 
variables and computing the contribution arising from the corresponding shift in energy, $\exp(-y\Phi_1) = \langle \exp(-y\,\Delta E) \rangle$. 
The correction term is due to the fact that in the $(N + 1)$-variable system the probability of generating the clauses is 
slightly lower, thus we have to cancel a fraction of them at random (see [?] for a detailed derivation). The ground-state 
energy is then evaluated as $\max_y \Phi(y)$. 

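Actually, the nature of the messages allows for a simplification of the cavity equations. We write 

$$q(u) = \eta\,\delta_{u,0} + \frac{1 - \eta}{2}\,\bigl(\delta_{u,+1} + \delta_{u,-1}\bigr). \tag{10}$$

Also, it should be clear that, as for the $p(h)$, what matters is only the sign of the field $h$; then: 

$$p^{(p)}(h) = \frac{1}{A^{(p)}}\Bigl[ w_0\,\delta_{S(h),0} + w_+\,\delta_{S(h),+1} + w_-\,\delta_{S(h),-1} \Bigr], \tag{11}$$

where $w_0$, $w_+$ and $w_-$ are the total weights of $h = 0$, $h > 0$ and $h < 0$, 
with $A = w_0 + w_+ + w_-$ and $w_+ = w_-$ because of the up-down symmetry of the problem. In practice, one needs to 
work with a single population of real numbers $\eta_i$, which leads to a stationary distribution $\rho(\eta)$. For any fixed value of 
$y$, the self-consistent equations (4)-(6) are solved as follows: 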

1. Consider a population of $\eta_i$ randomly distributed in [0, 1]. 

2. Do $K - 1$ times: 

• Pick a random integer $p$ with probability $f_{K\alpha}(p)$. 

• Choose $p$ values $\eta_1, \ldots, \eta_p$ and compute a probability distribution $p(h)$ according to (5). Given (11), this 
amounts to computing two real numbers: $w_0$ and the normalization $A$. 

• Compute $\Phi_1$ as in (8) through this $A$. 

3. Using these $K - 1$ distributions $p(h)$'s, compute a new $q(u)$ according to (4). Given (10), this is the same as 
computing a new value $\eta_0$. 

4. Use this new $q(u)$ and a newly extracted $p(h)$ to compute $\Phi_2$ as in (9). The total free energy can now be evaluated 
via (7). 

5. Replace an $\eta$ value randomly chosen in the population with the new value $\eta_0$. 

6. Go to step 2 until a stationary distribution $\rho(\eta)$ is reached. (The free energy then attains a stationary value.) 
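The six steps above can be sketched in a few dozen lines. This is a minimal illustration under stated assumptions, not the code used for the figures: the function names, the population size, the number of sweeps, the Knuth-style Poisson sampler and the choice of averaging $\Phi_1$ and $\Phi_2$ over the second half of the run are all assumptions made here. The weights $w_0$ and $A$ are obtained by building the distribution of the field $h$ one bias at a time, with each bias distributed as in (10).

```python
import math
import random

def field_weights(etas, y):
    """Return (w0, A): the unnormalized weight of h = 0 and the normalization (6)."""
    p = len(etas)
    weights = [0.0] * (2 * p + 1)          # weights[h + p] for h in [-p, p]
    weights[p] = 1.0
    for eta in etas:
        side = 0.5 * (1.0 - eta) * math.exp(-y)   # u = +1 or -1, factor exp(-y|u|)
        new = [0.0] * (2 * p + 1)
        for k, w in enumerate(weights):
            if w > 0.0:                    # support stays inside the array
                new[k] += eta * w          # u = 0
                new[k + 1] += side * w     # u = +1
                new[k - 1] += side * w     # u = -1
        weights = new
    weights = [w * math.exp(y * abs(k - p)) for k, w in enumerate(weights)]
    return weights[p], sum(weights)

def poisson(mean, rng):
    """Knuth's multiplication method; adequate for the moderate means used here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def population_dynamics(K, alpha, y, pop_size=200, sweeps=10, seed=0):
    """Iterate steps 1-6 and return (population, estimate of Phi(y))."""
    rng = random.Random(seed)
    pop = [rng.random() for _ in range(pop_size)]            # step 1
    steps = sweeps * pop_size
    phi1_sum = phi2_sum = 0.0
    n1 = n2 = 0
    for it in range(steps):
        measure = it >= steps // 2         # skip the transient (assumption)
        all_nonzero = 1.0
        for _ in range(K - 1):                                # step 2
            p = poisson(K * alpha, rng)
            w0, A = field_weights([rng.choice(pop) for _ in range(p)], y)
            all_nonzero *= 1.0 - w0 / A
            if measure:
                phi1_sum += -math.log(A) / y
                n1 += 1
        eta0 = 1.0 - all_nonzero           # step 3: u_0 = 0 iff some h_i = 0
        p = poisson(K * alpha, rng)                           # step 4
        w0, A = field_weights([rng.choice(pop) for _ in range(p)], y)
        w_side = 0.5 * (A - w0)            # w_+ = w_- by symmetry
        ratio = eta0 + (1.0 - eta0) * (w0 + w_side * (1.0 + math.exp(-2.0 * y))) / A
        if measure:
            phi2_sum += -math.log(ratio) / y
            n2 += 1
        pop[rng.randrange(pop_size)] = eta0                   # step 5
    phi = phi1_sum / n1 - (K - 1) * alpha * phi2_sum / n2     # eq. (7)
    return pop, phi
```

A call such as `population_dynamics(5, 2.0, 1.0)` returns the final population of $\eta$ values together with a noisy estimate of $\Phi(y)$; scanning $y$ and taking the maximum then gives an estimate of the ground-state energy.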

We are now going to discuss the cavity equations for large K and we will use the algorithm we have just described 
to check numerically our asymptotic results. 



III. THE SHANNON BOUND 



The cavity equations (4)-(6) have been discussed in [?] mainly concerning the value of $\alpha_c(K)$ and the behavior of 
the ground state energy $E(K)$ close to $\alpha_c(K)$. We want to compute $E(K)$ at any $\alpha$ in the large-$K$ limit. 

For large $K$, there is a self-consistent solution of the cavity equations such that all the $w_0$ are very small, in fact 
exponentially small. We just need to assume that the typical value of the probability $w_0/A$ of a vanishing cavity field is much smaller than $1/K$. This condition 
shows that $\eta$ is zero to leading order, because from eq. (4) one finds that 

$$\eta_0 = 1 - \prod_{i=1}^{K-1} \Bigl( 1 - \frac{w_0^{(p_i)}}{A^{(p_i)}} \Bigr). \tag{12}$$

We shall be more precise below as we verify self-consistently the assumption on $w_0$ and will be able to compute the 
first non-zero term. Here we work directly with $\eta = 0$. We need to compute the new value of $w_0$ and $w_+$ using eq. (5). 
If $K$ is large, $p$ is generically large (it is Poisson distributed with mean $K\alpha$). If $p$ is even (the case of $p$ odd is an 
immediate generalization) one finds: 



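$$w_0^{(p)} = \frac{1}{2^p}\binom{p}{p/2}\,e^{-py} \simeq \frac{2\,e^{-py}}{\sqrt{2\pi p}}, \tag{13}$$

$$w_+^{(p)} = w_-^{(p)} = \frac{1}{2^p}\sum_{q=0}^{p/2 - 1}\binom{p}{q}\,e^{-2qy} \tag{14}$$

$$\simeq \frac{p}{2^p}\int_0^{1/2} \frac{dx}{\sqrt{2\pi p\,x(1 - x)}}\,\exp\Bigl\{ p\,\bigl( -x\log x - (1 - x)\log(1 - x) - 2xy \bigr) \Bigr\}. \tag{15}$$

The integral can be evaluated for $p$ large by the saddle point method (the saddle point being $x^* = 1/(1 + e^{2y})$, where the exponent takes the value $p\,[-2y + \log(1 + e^{2y})]$) 
and we have 

$$w_+^{(p)} \simeq \Bigl( \frac{1 + e^{-2y}}{2} \Bigr)^p.$$

Since for any finite $y$ this is exponentially larger than $w_0$, the leading term in the normalization constant is just 

$$A^{(p)}(y) \simeq 2\,\Bigl( \frac{1 + e^{-2y}}{2} \Bigr)^p. \tag{16}$$

Now, it is not difficult to show that eq. (9) can be rewritten as 

$$\Phi_2(y) = -\frac{1}{y}\,\overline{\log\Bigl( \frac{A^{(p+1)}(y)}{A^{(p)}(y)} \Bigr)}, \tag{17}$$

and thus the free energy can be computed from the normalization (16) alone. We find that 

$$\Phi(y) = -\frac{1}{y}\Bigl[ \overline{\log A^{(p)}(y)} - (K - 1)\,\alpha\,\overline{\log\Bigl( \frac{A^{(p+1)}(y)}{A^{(p)}(y)} \Bigr)} \Bigr] = -\frac{1}{y}\Bigl[ \log 2 + \alpha \log\Bigl( \frac{1 + e^{-2y}}{2} \Bigr) \Bigr] \equiv \Phi_\infty(y). \tag{18}$$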
The ground state energy is the maximum of $\Phi(y)$ and, according to eq. (2), this gives a distortion $D$ for the parity 
source coder at large $K$ 

$$D = \frac{1}{2\alpha}\,\max_y \Phi_\infty(y). \tag{19}$$



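The Shannon bound says that the minimum distortion satisfies $1 - H_2(D_{\mathrm{Sh}}) = 1/\alpha$. A few lines of computation show 
that the distortion in (19) actually saturates the Shannon bound. Let us call $z$ the value of $y$ where $\Phi_\infty(y)$ is maximal. 
It satisfies: 

$$\log 2 = \alpha\,(z \tanh z - \log\cosh z). \tag{20}$$

Then one gets 

$$\frac{\Phi_\infty(z)}{2\alpha} = \frac{1}{e^{2z} + 1}, \qquad H_2\Bigl( \frac{\Phi_\infty(z)}{2\alpha} \Bigr) = \frac{1}{\log 2}\Bigl[ \log\bigl( e^{2z} + 1 \bigr) - \frac{2z\,e^{2z}}{e^{2z} + 1} \Bigr]. \tag{21}$$

After some algebra one can derive from this the sought result: 

$$1 - H_2\Bigl( \frac{\Phi_\infty(z)}{2\alpha} \Bigr) = \frac{1}{\alpha}. \tag{22}$$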



This shows that at very large $K$ the XORSAT problem gives exactly the Shannon limit. We now look at finite-$K$ 
corrections in order to see how this asymptotic performance is reached. 
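The saturation of the Shannon bound can also be checked numerically. The sketch below is only an illustration: it solves the stationarity condition $\log 2 = \alpha (z \tanh z - \log\cosh z)$ by bisection, evaluates $D = \Phi_\infty(z)/(2\alpha)$ and compares $1 - H_2(D)$ with $1/\alpha$. The function names and the bisection bracket are assumptions made here, and $H_2$ is taken in bits.

```python
import math

def phi_inf(y, alpha):
    """Large-K free energy: -(1/y) [log 2 + alpha log((1 + exp(-2y)) / 2)]."""
    return -(math.log(2.0) + alpha * math.log(0.5 * (1.0 + math.exp(-2.0 * y)))) / y

def solve_z(alpha, tol=1e-13):
    """Root of alpha (z tanh z - log cosh z) = log 2 for alpha > 1, by bisection.

    The left-hand side grows monotonically from 0 to alpha log 2, so the root is unique.
    """
    def f(z):
        return alpha * (z * math.tanh(z) - math.log(math.cosh(z))) - math.log(2.0)
    lo, hi = 1e-9, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def binary_entropy(x):
    """H_2(x) in bits."""
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

for alpha in (1.3, 2.0, 5.0):
    z = solve_z(alpha)
    D = phi_inf(z, alpha) / (2.0 * alpha)
    print(alpha, round(z, 6), round(D, 6), round(1.0 - binary_entropy(D), 6), round(1.0 / alpha, 6))
```

In each row the last two columns should agree up to the bisection tolerance, and $D$ should coincide with $1/(e^{2z} + 1)$.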



IV. CORRECTIONS 



In order to compute the first order corrections to the leading behavior we compute the normalization constant in (6) 
under the hypothesis of small (but finite) $\eta$: 

$$A^{(p)}(y) = \prod_{a=1}^{p}(1 - \eta_a)\,\frac{e^{-py}}{2^p}\,g_p(y) + \sum_{a=1}^{p} \eta_a \prod_{b \neq a}(1 - \eta_b)\,\frac{e^{-(p-1)y}}{2^{p-1}}\,g_{p-1}(y) + \frac{e^{-(p-2)y}}{2^{p-2}}\,g_{p-2}(y)\;O\bigl( (p\eta)^2 \bigr) + \ldots, \tag{23}$$

$$g_p(y) = \sum_{q=0}^{p}\binom{p}{q}\,e^{\,y\,|p - 2q|}. \tag{24}$$



As we have shown above, the whole free energy can be computed from the knowledge of $A^{(p)}(y)$. In order to calculate 
it, we compute the function $g_p(y)$ in the large $p$ limit. We first notice that it can be written as 

$$g_p(y) = \sum_{\sigma_1, \ldots, \sigma_p} \exp\Bigl( y\,\Bigl| \sum_{i=1}^{p} \sigma_i \Bigr| \Bigr), \tag{25}$$

where $\sigma_i$ are Ising spins. Thus 

$$g_p(y) + g_p(-y) = 2\,(2\cosh y)^p. \tag{26}$$

We use a Fourier transformation to express $g_p(-y)$: 

$$g_p(-y) = \frac{y}{\pi}\int_{-\infty}^{\infty} \frac{dk}{k^2 + y^2}\,\sum_{q=0}^{p}\binom{p}{q}\,e^{\,i(p - 2q)k} = \frac{y}{\pi}\,2^p \int_{-\infty}^{\infty} dk\,\frac{(\cos k)^p}{k^2 + y^2} \simeq \frac{y}{\pi}\,2^p\,\sqrt{\frac{2\pi}{p}}\,\sum_{n=-\infty}^{\infty} \frac{(-1)^{np}}{(\pi n)^2 + y^2}.$$

The sum can be done exactly and we have 

$$g_{p\ \mathrm{even}}(-y) = \frac{2^{p+1}}{\sqrt{2\pi p}}\,\frac{1}{\tanh y}\,\bigl( 1 - 1/4p + O(1/p^2) \bigr), \qquad g_{p\ \mathrm{odd}}(-y) = \frac{2^{p+1}}{\sqrt{2\pi p}}\,\frac{1}{\sinh y}\,\bigl( 1 - 1/4p + O(1/p^2) \bigr). \tag{27}$$
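The identity $g_p(y) + g_p(-y) = 2(2\cosh y)^p$ and the leading large-$p$ behavior of $g_p(-y)$ are easy to test against the exact binomial sum. The short sketch below does so; the function names are illustrative, and only the leading-order prefactor of the asymptotic formula is compared, with a loose tolerance, since the check says nothing about the $1/p$ term.

```python
import math

def g(p, y):
    """Exact g_p(y) = sum_q C(p, q) exp(y |p - 2q|)."""
    return sum(math.comb(p, q) * math.exp(y * abs(p - 2 * q)) for q in range(p + 1))

def g_minus_leading(p, y):
    """Leading large-p form of g_p(-y): 2^(p+1) / (sqrt(2 pi p) t(y)),
    with t = tanh for even p and t = sinh for odd p."""
    t = math.tanh(y) if p % 2 == 0 else math.sinh(y)
    return 2.0 ** (p + 1) / (math.sqrt(2.0 * math.pi * p) * t)

# Identity g_p(y) + g_p(-y) = 2 (2 cosh y)^p, exact for any p.
p, y = 7, 0.8
lhs = g(p, y) + g(p, -y)
rhs = 2.0 * (2.0 * math.cosh(y)) ** p
print("identity, relative error:", abs(lhs - rhs) / rhs)

# Ratio exact / leading-order asymptotics for growing p (should approach 1).
for p in (20, 21, 200, 201):
    print(p, g(p, -1.0) / g_minus_leading(p, 1.0))
```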
Using (26), we get for $p$ even 

$$g_p(y) = 2^{p+1}(\cosh y)^p \Bigl[ 1 - \frac{1}{\sqrt{2\pi p}\,(\cosh y)^p \tanh y}\Bigl( 1 - \frac{1}{4p} + O\Bigl( \frac{1}{p^2} \Bigr) \Bigr) \Bigr], \tag{28}$$

with the replacement $\tanh y \to \sinh y$ if $p$ is odd. To the leading order we have thus 

$$A^{(p)}(y) = 2\,\Bigl( \frac{1 + e^{-2y}}{2} \Bigr)^p \bigl( 1 + O(p^{\gamma} e^{-p}) \bigr), \tag{29}$$



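with some exponent $\gamma$ which depends on the actual order of magnitude of $\eta$. To compute it we first need to know 
the weight for $h = 0$. If $p$ is even we use eq. (5) and we note that the main contribution (in the same hypothesis of $\eta$ 
small) is given by 

$$\frac{w_0^{(p)}}{A^{(p)}(y)}\Big|_{p\ \mathrm{even}} = \frac{1}{A^{(p)}(y)}\binom{p}{p/2}\frac{e^{-py}}{2^p}\Bigl[ \prod_{a=1}^{p}(1 - \eta_a) + O(p\eta) \Bigr]$$

$$= \frac{1}{2}\Bigl( \frac{1 + e^{-2y}}{2} \Bigr)^{-p}\bigl( 1 + O(p^{\gamma} e^{-p}) \bigr)\,e^{-py}\,\frac{2}{\sqrt{2\pi p}}\Bigl( 1 - \frac{1}{4p} + O\Bigl( \frac{1}{p^2} \Bigr) \Bigr)\bigl( 1 + O(p\eta) \bigr)$$

$$= \frac{(\cosh y)^{-p}}{\sqrt{2\pi p}}\,\bigl( 1 - 1/4p + O(1/p^2) + O(p\eta) \bigr). \tag{30}$$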



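(Here we have also assumed that $p > 0$, since $w_0/A = 1$ if $p = 0$.) On the other hand, if $p$ is odd we have 

$$\frac{w_0^{(p)}}{A^{(p)}(y)}\Big|_{p\ \mathrm{odd}} = \frac{1}{A^{(p)}(y)}\binom{p - 1}{(p - 1)/2}\frac{e^{-(p-1)y}}{2^{p-1}}\sum_{a=1}^{p}\eta_a\,\bigl( 1 + O(p\eta) \bigr) \simeq \frac{e^{y}}{\sqrt{2\pi p}}\,(\cosh y)^{-p}\,\sum_{a=1}^{p}\eta_a\,\bigl( 1 + O(1/p) \bigr).$$

To the leading order, $\eta$ does not fluctuate and takes the value 

$$\eta \simeq -\log(1 - \eta) \simeq \sum_{i=1}^{K-1}\frac{w_0^{(p_i)}}{A^{(p_i)}} = (K - 1)\Bigl[ e^{-K\alpha} + e^{-K\alpha}\bigl( \cosh(K\alpha) - 1 \bigr)\Bigl\langle \frac{w_0}{A} \Bigr\rangle_{p\ \mathrm{even}} + e^{-K\alpha}\sinh(K\alpha)\Bigl\langle \frac{w_0}{A} \Bigr\rangle_{p\ \mathrm{odd}} \Bigr]$$

$$= (K - 1)\sum_{p\ \mathrm{even} > 0} f_{K\alpha}(p)\,\frac{1}{\sqrt{2\pi p}\,(\cosh y)^p} + O(K e^{-K\alpha}) + \ldots,$$

since the two other terms ($p = 0$ and $p$ odd) are exponentially subleading. In order to perform this average we use 

$$\frac{1}{p^z} = \frac{1}{\Gamma(z)}\int_0^{\infty} dt\;t^{z-1}\,e^{-pt}$$

to express the denominator. This allows one to perform the average over $p$ even. We then have 

$$\eta \simeq \frac{(K - 1)\,e^{-K\alpha}}{\pi\sqrt{2}}\int_0^{\infty}\frac{dt}{\sqrt{t}}\,\bigl[ \cosh\bigl( \beta e^{-t} \bigr) - 1 \bigr], \tag{31}$$

where $\beta = K\alpha/\cosh y$. We then set $t = \tau/\beta$ and expand in $1/\beta$. This gives 

$$\eta \simeq \frac{K - 1}{2\sqrt{2\pi\beta}}\,\exp\Bigl[ -K\alpha\Bigl( 1 - \frac{1}{\cosh y} \Bigr) \Bigr]\bigl( 1 + O(1/\beta) \bigr), \tag{32}$$

which shows a posteriori that the small $\eta$ hypothesis is consistent. We now go back to (23) and get: 

$$A^{(p)}(y) = 2\,\Bigl( \frac{1 + e^{-2y}}{2} \Bigr)^p \bigl[ 1 + O(p\eta) \bigr]. \tag{33}$$

From this result and from eq. (17) one finds that 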



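$$\Phi_2(y) = -\frac{1}{y}\,\log\Bigl( \frac{1 + e^{-2y}}{2} \Bigr) + \ldots \tag{34}$$

Moreover, 

$$\Phi_1(y) = -\frac{1}{y}\Bigl[ \log 2 + K\alpha \log\Bigl( \frac{1 + e^{-2y}}{2} \Bigr) \Bigr] + \ldots \tag{35}$$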



We can now compute the total free energy (7). One can check directly that the leading corrections to the infinite-$K$ 
limit, of order $O\bigl( K^{3/2}\exp( -K\alpha(1 - 1/\cosh y) ) \bigr)$, vanish. We are then left with 

$$\Phi(y) = \Phi_\infty(y) + \Delta\Phi_K(y), \tag{36}$$

where $\lim_{K \to \infty} \Delta\Phi_K = 0$. We assume that the maximum of the $\Phi(y)$ in (36) is at $y = z + \epsilon$, where $\epsilon$ is exponentially 
small at large $K$ (we shall verify this hypothesis self-consistently) and $z$ is the solution of eq. (20). The condition 
$\Phi'(y) = 0$ then results in 

$$\epsilon = O\Bigl( K^{-1/2}\,e^{-K\alpha(1 - 1/\cosh z)} \Bigr), \tag{37}$$

where the dependence of $z$ on $\alpha$ is extracted from (20). One finds that $z$ is a monotonic decreasing function of $\alpha$. In 
particular, $z \simeq \sqrt{2\log 2/\alpha}$ at large $\alpha$ while $z$ diverges as $(-1/2)\log(\alpha - 1)$ as $\alpha \to 1$. It follows that $\epsilon$ is exponentially 
small in any case. Coming back to eq. (19), it is then easy to see that to the leading order 



$$D = \frac{1}{2\alpha}\bigl( \Phi_\infty(z) + \Delta\Phi_K(z) \bigr) = D_{\mathrm{Sh}} + C_K(\alpha),$$

where the corrections $C_K(\alpha)$ are finally 



C K (a) 



^jgltv.rw - (. + 1 (Sff - .) + c (i,) ) (1 + c (# 



/2 e~ Ka 



(38) 



(39) 



$z$ being the solution of eq. (20). 
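The limiting behaviors of $z(\alpha)$ quoted above can be checked numerically. The sketch below also evaluates the combination $\alpha(1 - 1/\cosh z)$ that appears in the exponentials of this section; identifying it with the exponent $\Delta$ of eq. (3) is an assumption made here for illustration, supported only by the fact that it stays within $[\log 2, 1]$. The function names are illustrative.

```python
import math

def solve_z(alpha, tol=1e-13):
    """Root of alpha (z tanh z - log cosh z) = log 2 for alpha > 1, by bisection."""
    def f(z):
        return alpha * (z * math.tanh(z) - math.log(math.cosh(z))) - math.log(2.0)
    lo, hi = 1e-9, 50.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def decay_exponent(alpha):
    """alpha (1 - 1/cosh z): assumed here to play the role of Delta in eq. (3)."""
    return alpha * (1.0 - 1.0 / math.cosh(solve_z(alpha)))

alphas = (1.001, 1.3, 2.0, 10.0, 1000.0)
for alpha in alphas:
    z = solve_z(alpha)
    print(alpha, round(z, 5), round(math.sqrt(2.0 * math.log(2.0) / alpha), 5),
          round(decay_exponent(alpha), 5))
```

The second and third columns approach each other as $\alpha$ grows, while $z$ grows without bound (only logarithmically) as $\alpha \to 1$.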

We now look at numerical data in order to verify our analytical prediction. In Fig. 4 we plot the difference between 
the actual distortion of the PSC as obtained from the numerical solution of the cavity equations at $\alpha = 1.3$ and 
the corresponding Shannon value. The curve is the theoretical prediction in (39), where we neglected the $1/K^2$ 
corrections. The same plot but for $\alpha = 2$ is shown in Fig. 5. In both cases there is a very good agreement with the 
analytical prediction. 



V. CONCLUSIONS 



We have shown that the theoretical capacity of the Parity Source Coder is optimal at large $K$ and that the 
corrections to the leading behaviour are exponentially small. Nevertheless, due to the smallness of $\Delta$ (cfr. Fig. ?), 
the exponential decreases quite slowly, and $1/K$ corrections are needed to take into account the deviations from the 
leading behavior at relatively small values of $K$. 

FIG. 4: Theoretical capacity of the PSC, $\alpha = 1.3$. 

FIG. 5: Theoretical capacity of the PSC, $\alpha = 2.0$. 